AITopics

2605.29272

Genre: Research Report (0.50)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.45)

arXiv.org Machine Learning May-12-2026

Proximal Path-Specific Inference

Bai, Yang, Wu, Sihan, Sun, Baoluo, Cui, Yifan

Mediation analysis (Robins & Greenland 1992, Pearl 2001, Imai, Keele & Tingley 2010, Tchetgen Tchetgen & Shpitser 2012) provides a principled framework for investigating causal mechanisms by decomposing the effect of a treatment A on an outcome Y into pathways operating through a mediator of interest M. Classical mediation analysis focuses on the natural indirect effect, corresponding to the pathway from A to Y through M, and the natural direct effect, corresponding to pathways not through M. These estimands are well understood when a single mediator is present and strong identification assumptions hold. However, in many applications, there exist multiple intermediate variables between treatment and outcome. In such settings, conventional mediation analysis typically requires the absence of treatment-induced mediator-outcome confounders (often referred to as recanting witnesses), as well as the absence of unmeasured confounding. Under these circumstances, commonly used identification assumptions such as sequential ignorability (Imai, Keele & Yamamoto 2010) or nonparametric structural equation models with independent errors (NPSEM-IE) (Pearl 2009) no longer suffice to identify natural indirect effects (Avin et al. 2005, Tchetgen Tchetgen & VanderWeele 2014). Figure 1 illustrates this issue: the recanting witness D is directly affected by A and simultaneously confounds the relationship between M and Y. Such treatment-induced confounding is common in epidemiologic studies, particularly when the mediator of interest occurs long after treatment initiation (Robins 1999). A motivating example arises in studies of preterm birth. Mediation analysis has been widely used to explore whether adequate prenatal care (A) reduces the risk of preterm birth (Y) through preeclampsia (M) (Vansteelandt & VanderWeele 2012, VanderWeele et al. 2014, Xia & Chan 2023).
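For orientation, the classical single-mediator decomposition referred to in this abstract can be written in counterfactual notation. The notation is an assumption here, since the abstract does not define it: Y(a, M(a')) denotes the outcome under treatment a with the mediator set to the value it would take under treatment a', and the treatment is taken to be binary. This is the standard textbook form, not the paper's path-specific estimand.

```latex
\begin{align*}
\mathrm{TE}  &= E[Y(1, M(1))] - E[Y(0, M(0))], \\
\mathrm{NDE} &= E[Y(1, M(0))] - E[Y(0, M(0))], \\
\mathrm{NIE} &= E[Y(1, M(1))] - E[Y(1, M(0))], \\
\mathrm{TE}  &= \mathrm{NDE} + \mathrm{NIE}.
\end{align*}
```

The last line follows by adding and subtracting E[Y(1, M(0))] in the total effect.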

artificial intelligence, estimator, machine learning, (16 more...)

2605.09462

Country: North America > United States > California (0.28)

Genre:

Research Report > Strength Medium (0.48)
Research Report > Observational Study (0.48)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Public Health (1.00)
Health & Medicine > Epidemiology (1.00)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.90)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.68)

Kosko, Matthew, Bargagli-Stoffi, Falco J., Wang, Lin, Santacatterina, Michele

Fast Uncertainty Quantification for Kernel-Based Estimators in Large-Scale Causal Inference

arXiv.org Machine Learning Mar-17-2026

Kernel methods are widely used in causal inference for tasks such as treatment effect estimation, policy evaluation, and policy learning. The bootstrap is a standard tool for uncertainty quantification because of its broad applicability. As increasingly large datasets become available, such as the 2023 U.S. Natality data from the National Vital Statistics System (NVSS), which includes 3,596,017 registered births, the computational demands of these methods increase substantially. Kernel methods are known to scale poorly with sample size, and this limitation is further exacerbated by the repeated re-fitting required by the bootstrap. As a result, bootstrap-based inference for kernel-based estimators can become computationally infeasible in large-scale settings. In this paper, we address these challenges by extending the causal Bag of Little Bootstraps (cBLB) algorithm to kernel methods. Our approach achieves computational scalability by combining subsampling and resampling while preserving first-order uncertainty quantification and asymptotically correct coverage. We evaluate the method across three representative implementations: kernelized augmented outcome-weighted learning, kernel-based minimax weighting, and double machine learning with kernel support vector machines. We show in simulations that our method yields confidence intervals with nominal coverage at a fraction of the computational cost. We further demonstrate its utility in a real-world application by estimating the effect of any amount of smoking on birth weight, as well as the optimal treatment regime, using the NVSS dataset, where the standard bootstrap is prohibitively expensive computationally and effectively infeasible at this scale.
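The combination of subsampling and resampling mentioned in this abstract can be illustrated with a minimal sketch of the generic Bag of Little Bootstraps idea, applied to the sample mean. This is not the authors' cBLB algorithm and involves no kernel estimator; the function name `blb_standard_error`, the subset-size exponent of 0.7, and the other defaults are assumptions chosen for illustration.

```python
import numpy as np

def blb_standard_error(x, n_subsets=5, subset_exponent=0.7, n_resamples=100, seed=0):
    """Illustrative Bag of Little Bootstraps standard error for the sample mean.

    Hypothetical helper: all names and defaults are assumptions for this sketch.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size
    b = int(n ** subset_exponent)  # subset size, much smaller than n
    subset_ses = []
    for _ in range(n_subsets):
        # Step 1: draw a small subset without replacement.
        subset = rng.choice(x, size=b, replace=False)
        means = np.empty(n_resamples)
        for r in range(n_resamples):
            # Step 2: resample n points from the subset. The resample is stored
            # as multinomial counts over the b distinct values, so each
            # replicate costs O(b) rather than O(n).
            counts = rng.multinomial(n, np.full(b, 1.0 / b))
            means[r] = counts @ subset / n
        # Step 3: spread of the resampled estimates within this subset.
        subset_ses.append(means.std(ddof=1))
    # Step 4: average the per-subset standard errors; report the full-data estimate.
    return float(x.mean()), float(np.mean(subset_ses))
```

A normal-approximation interval would then be the estimate plus or minus 1.96 times the returned standard error. The point of the design is that the estimator is only ever evaluated on weighted subsets of size b, which is what makes the approach attractive when each fit is expensive.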

artificial intelligence, estimator, machine learning, (17 more...)

2603.13662

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report > Experimental Study (0.93)

Industry:

Education (0.67)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)

Neural Information Processing Systems Feb-17-2026, 21:00:52 GMT

bdabb5d4262bcfb6a1d529d690a6c82b-Paper-Conference.pdf

artificial intelligence, data mining, machine learning, (18 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(2 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.67)
Government (0.67)
Information Technology (0.67)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Data Science > Data Mining (0.67)

Neural Information Processing Systems Feb-12-2026, 18:36:15 GMT

4d254c96c5ff500dda2ac0a58987aeba-Paper-Conference.pdf

artificial intelligence, machine learning, modeling & simulation, (17 more...)

Country:

North America > United States (0.28)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Middle East > Oman (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Government (0.93)
Health & Medicine > Public Health (0.67)
Health & Medicine > Therapeutic Area > Oncology (0.45)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Neural Information Processing Systems Feb-11-2026, 12:07:21 GMT

DeepMed: Semiparametric Causal Mediation Analysis with Debiased Deep Learning

Causal inference is no exception.

artificial intelligence, lasso 0, machine learning, (17 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Shanghai > Shanghai (0.04)
North America > Greenland (0.04)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Neural Information Processing Systems Feb-10-2026, 21:11:01 GMT

b8102d1fa5df93e62cf26cd4400a0727-Supplemental.pdf

A graph G is a collection of nodes and links connecting pairs of nodes.

algorithm, artificial intelligence, confidence sequence, (14 more...)

Technology: Information Technology > Artificial Intelligence (0.46)

arXiv.org Machine Learning Jan-6-2026

Double Machine Learning of Continuous Treatment Effects with General Instrumental Variables

Chen, Shuyuan, Zhang, Peng, Cui, Yifan

Estimating causal effects of continuous treatments is a common problem in practice, for example, in studying dose-response functions. Classical analyses typically assume that all confounders are fully observed, whereas in real-world applications, unmeasured confounding often persists. In this article, we propose a novel framework for local identification of dose-response functions using instrumental variables, thereby mitigating bias induced by unobserved confounders. We introduce the concept of a uniform regular weighting function and consider covering the treatment space with a finite collection of open sets. On each of these sets, such a weighting function exists, allowing us to identify the dose-response function locally within the corresponding region. For estimation, we develop an augmented inverse probability weighting score for continuous treatments under a debiased machine learning framework with instrumental variables. We further establish the asymptotic properties when the dose-response function is estimated via kernel regression or empirical risk minimization. Finally, we conduct both simulation and empirical studies to assess the finite-sample performance of the proposed methods.

artificial intelligence, assumption 2, machine learning, (16 more...)

2601.01471

Genre: Research Report (0.82)

Industry:

Health & Medicine (0.45)
Education (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Neural Information Processing Systems Dec-25-2025, 01:41:20 GMT

DeepMed: Semiparametric Causal Mediation Analysis with Debiased Deep Learning

Causal mediation analysis can unpack the black box of causality and is therefore a powerful tool for disentangling causal pathways in biomedical and social sciences, and also for evaluating machine learning fairness. To reduce bias in estimating Natural Direct and Indirect Effects in mediation analysis, we propose a new method called DeepMed that uses deep neural networks (DNNs) to cross-fit the infinite-dimensional nuisance functions in the efficient influence functions. We obtain novel theoretical results that our DeepMed method (1) can achieve the semiparametric efficiency bound without imposing sparsity constraints on the DNN architecture and (2) can adapt to certain low-dimensional structures of the nuisance functions, significantly advancing the existing literature on DNN-based semiparametric causal inference. Extensive synthetic experiments are conducted to support our findings and also expose the gap between theory and practice. As a proof of concept, we apply DeepMed to analyze two real datasets on machine learning fairness and reach conclusions consistent with previous findings.
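Cross-fitting of nuisance functions, as mentioned in this abstract, can be sketched in a much simpler setting than mediation. The example below is a stand-in under stated assumptions: it uses a partially linear model y = theta * d + g(x) + e rather than the mediation influence functions, and polynomial least squares rather than DNNs. The function names (`poly_features`, `cross_fit_predictions`, `cross_fitted_plr`) are hypothetical and not part of DeepMed.

```python
import numpy as np

def poly_features(x, degree=3):
    """Polynomial basis [1, x, x^2, ...]; a simple stand-in for a flexible learner."""
    return np.vander(x, N=degree + 1, increasing=True)

def cross_fit_predictions(x, target, n_folds=5, seed=0):
    """Out-of-fold least-squares predictions of E[target | x]."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(x.size), n_folds)
    preds = np.empty(x.size)
    for held_out in folds:
        # Fit on every fold except the held-out one, then predict on it,
        # so no observation is predicted by a model that saw it.
        train = np.setdiff1d(np.arange(x.size), held_out)
        coef, *_ = np.linalg.lstsq(poly_features(x[train]), target[train], rcond=None)
        preds[held_out] = poly_features(x[held_out]) @ coef
    return preds

def cross_fitted_plr(y, d, x, n_folds=5, seed=0):
    """Residual-on-residual estimate of theta in y = theta * d + g(x) + e."""
    y_res = y - cross_fit_predictions(x, y, n_folds, seed)
    d_res = d - cross_fit_predictions(x, d, n_folds, seed)
    theta = np.sum(d_res * y_res) / np.sum(d_res ** 2)
    # Plug-in standard error based on the orthogonal score.
    psi = d_res * (y_res - theta * d_res)
    se = np.sqrt(np.mean(psi ** 2) / x.size) / np.mean(d_res ** 2)
    return float(theta), float(se)
```

The same out-of-fold pattern applies when the learner is a neural network: only the fitting and prediction steps inside the loop change.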

deepmed, name change, semiparametric causal mediation analysis, (3 more...)

Industry: Law > Alternative Dispute Resolution (0.91)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.41)

arXiv.org Machine Learning Dec-19-2025

xtdml: Double Machine Learning Estimation to Static Panel Data Models with Fixed Effects in R

Polselli, Annalivia

The double machine learning (DML) method combines the predictive power of machine learning with statistical estimation to conduct inference about the structural parameter of interest. This paper presents the R package `xtdml`, which implements DML methods for partially linear panel regression models with low-dimensional fixed effects and high-dimensional confounding variables, as proposed by Clarke and Polselli (2025). The package provides functionality to: (a) learn nuisance functions with machine learning algorithms from the `mlr3` ecosystem, (b) handle unobserved individual heterogeneity by choosing among first-difference transformation, within-group transformation, and correlated random effects, and (c) transform the covariates with min-max normalization and polynomial expansion to improve learning performance. We showcase the use of `xtdml` with both simulated and real longitudinal data.
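The data transformations named in (b) and (c) are standard panel-data operations and can be sketched generically. The code below is not the `xtdml` API (which is an R package); it is a small NumPy illustration under the assumption that rows are sorted by time within each unit, and the function names are hypothetical.

```python
import numpy as np

def within_transform(values, ids):
    """Within-group transformation: subtract each unit's own time average."""
    values = np.asarray(values, dtype=float)
    out = np.empty_like(values)
    for unit in np.unique(ids):
        mask = ids == unit
        out[mask] = values[mask] - values[mask].mean(axis=0)
    return out

def first_difference(values, ids):
    """First-difference transformation: consecutive-period differences per unit.

    Assumes rows are already sorted by time within each unit.
    """
    values = np.asarray(values, dtype=float)
    diffs = [np.diff(values[ids == unit], axis=0) for unit in np.unique(ids)]
    return np.concatenate(diffs)

def min_max_normalize(values):
    """Rescale each column to [0, 1]; constant columns are mapped to 0."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(axis=0), values.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (values - lo) / span
```

Both the within-group and first-difference transformations remove a time-constant individual effect: with y = alpha_i + 2 * x, either transformation of y depends only on x and not on alpha_i.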

double machine learning estimation, learner, transformation, (11 more...)

2512.15965

Country:

Europe > Austria > Vienna (0.14)
North America > United States (0.04)
Asia > India (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Banking & Finance (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)